Ming-Hsuan Yang

Reasmory: 3D Reconstruction as Explicit Memory for VLMs Spatial Reasoning

May 31, 2026
Via arXiv

CV-Arena: An Open Benchmark for Instructional Computer Vision Problem Solving with Human-AI Collaborative Preferences

May 30, 2026
Via arXiv

MotiMotion: Motion-Controlled Video Generation with Visual Reasoning

May 21, 2026
Via arXiv

GeoWeaver: Grounding Visual Tokens with Geometric Evidence before Scene Reasoning

May 21, 2026
Via arXiv

SAMOFT: Robust Multi-Object Tracking via Region and Flow

May 10, 2026
Via arXiv

AlbumFill: Album-Guided Reasoning and Retrieval for Personalized Image Completion

May 04, 2026
Via arXiv

Evolution of Video Generative Foundations

Apr 07, 2026
Via arXiv

Interactive Tracking: A Human-in-the-Loop Paradigm with Memory-Augmented Adaptation

Apr 02, 2026
Via arXiv

Finding Distributed Object-Centric Properties in Self-Supervised Transformers

Mar 27, 2026
Via arXiv

LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs

Mar 19, 2026
Via arXiv